10 research outputs found

    Multi-dimensional clustering in user profiling

    User profiling has attracted an enormous number of technological methods and applications. With the increasing number of products and services, user profiling has created opportunities to catch the attention of the user as well as to achieve high user satisfaction. Providing users with what they want, when and how they want it, depends largely on understanding them. The user profile is the representation of the user and holds the information about the user. These profiles are the outcome of user profiling. Personalization is the adaptation of services to meet the user’s needs and expectations. Therefore, knowledge about the user leads to a personalized user experience. In user profiling applications the major challenge is to build and handle user profiles. In the literature there are two main user profiling methods, collaborative and content-based. Apart from these traditional profiling methods, a number of classification and clustering algorithms have been used to classify user-related information to create user profiles. However, the profiling achieved through these works is lacking in terms of accuracy. This is because all information within the profile has the same influence during profiling, even though some of it is irrelevant user information. In this thesis, a primary aim is to provide an insight into the concept of user profiling. For this purpose a comprehensive background study of the literature was conducted and is summarized in this thesis. Furthermore, existing user profiling methods as well as the classification and clustering algorithms were investigated. As one of the objectives of this study, the use of these algorithms for user profiling was examined. A number of classification and clustering algorithms, such as Bayesian Networks (BN) and Decision Trees (DTs), have been simulated using user profiles and their classification accuracy was evaluated. Additionally, a novel clustering algorithm for user profiling, namely Multi-Dimensional Clustering (MDC), has been proposed. MDC is a modified version of the Instance Based Learner (IBL) algorithm. In IBL every feature has an equal effect on the classification regardless of its relevance. MDC differs from IBL by assigning weights to feature values to distinguish the effect of the features on clustering. Existing feature weighting methods, for instance Cross Category Feature (CCF), have also been investigated. In this thesis, three feature value weighting methods have been proposed for the MDC: the MDC weight method by Cross Clustering (MDC-CC), the MDC weight method by Balanced Clustering (MDC-BC) and the MDC weight method by changing the Lower-limit to Zero (MDC-LZ). All of these weighted MDC algorithms have been tested and evaluated. Additional simulations were carried out with existing weighted and non-weighted IBL algorithms (i.e. K-Star and Locally Weighted Learning (LWL)) in order to demonstrate the performance of the proposed methods. Furthermore, a real-life scenario is implemented to show how the MDC can be used for user profiling to improve personalized service provisioning in mobile environments. The experiments presented in this thesis were conducted using user profile datasets that reflect the user’s personal information, preferences and interests. The simulations with existing classification and clustering algorithms (e.g. Bayesian Networks (BN), Naïve Bayesian (NB), Lazy learning of Bayesian Rules (LBR), Iterative Dichotomiser 3 (ID3)) were performed on the WEKA (version 3.5.7) machine learning platform. WEKA serves as a workbench for a collection of popular learning schemes implemented in Java. In addition, MDC-CC, MDC-BC and MDC-LZ have been implemented as a Java application in NetBeans IDE 6.1 Beta and in MATLAB. Finally, the real-life scenario is implemented as a Java Mobile Application (Java ME) in NetBeans IDE 7.1. All simulation results were evaluated based on error rate and accuracy.
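The idea of letting features influence an instance-based learner unequally, as described in the abstract above, can be illustrated with a minimal sketch. It is not the MDC algorithm or any of the MDC-CC, MDC-BC or MDC-LZ weighting methods, whose details are not given in this listing; the function names, the toy profiles and the weight values are all hypothetical. The sketch uses a weighted mismatch count over nominal features and a single nearest neighbour.

```python
def weighted_mismatch(a, b, weights):
    """Sum the weights of the features on which two profiles disagree."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def classify_1nn(query, training, weights):
    """Return the label of the training profile closest to the query."""
    _, label = min(
        training,
        key=lambda inst: weighted_mismatch(query, inst[0], weights),
    )
    return label


# Hypothetical toy profiles: (age group, favourite genre, phone colour)
training = [
    (("young", "rock", "red"), "concerts"),
    (("senior", "classical", "blue"), "opera"),
    (("senior", "rock", "blue"), "concerts"),
]
query = ("young", "classical", "red")

# Equal weights: every feature counts the same, as in an unweighted learner.
equal = classify_1nn(query, training, (1.0, 1.0, 1.0))

# Assumed weights: genre matters most, phone colour is treated as irrelevant.
weighted = classify_1nn(query, training, (0.2, 1.0, 0.0))

print(equal, weighted)
```

With equal weights the query is matched to the first profile, which differs only in genre; once the irrelevant colour feature is weighted down, the profile sharing the genre becomes the nearest one.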

    Testing and Analysis of Activities of Daily Living Data with Machine Learning Algorithms

    It is estimated that 28% of the European Union’s population will be aged 65 or older by 2060. Europe is getting older, and this has a high impact on the estimated cost of supporting older people. This is because, compared to the younger generation, older people are at greater risk of cognitive impairment, frailty and social exclusion, which could have negative effects on their lives as well as on the economy of the European Union. The ‘active and independent ageing’ concept aims to support older people to live active and independent lives in their preferred location, and this goal can be fully achieved by understanding older people (i.e. their needs, abilities, preferences and the difficulties they face during the day). One of the most reliable resources for such information is the Activities of Daily Living (ADL), which gives essential information about people’s lives. Understanding this kind of information is an important step towards providing the right support, facilities and care for the older population. In the literature, there is a lack of studies that evaluate the performance of Machine Learning algorithms on ADL data. This work aims to test and analyze the performance of well-known Machine Learning algorithms with ADL data.

    A comparative study of selected classification accuracy in user profiling

    In recent years the use of personalization in service provisioning applications has been very popular. However, effective personalization cannot be achieved without accurate user profiles. A number of classification algorithms have been used to classify user-related information to create accurate user profiles. In this study four different classification algorithms, namely naive Bayesian (NB), Bayesian Networks (BN), lazy learning of Bayesian rules (LBR) and instance-based learner (IB1), are compared using a set of user profile data. According to our simulation results, the NB and IB1 classifiers have the highest classification accuracy with the lowest error rate.

    Classification accuracy performance of Naïve Bayesian (NB), Bayesian Networks (BN), Lazy Learning of Bayesian Rules (LBR) and Instance-Based Learner (IB1) - comparative study

    In recent years the use of personalization in service provisioning applications has been very popular. However, effective personalization cannot be achieved without accurate user profiles. A number of classification algorithms have been used to classify user-related information to create accurate user profiles. In this study four different classification algorithms, namely naive Bayesian (NB), Bayesian networks (BN), lazy learning of Bayesian rules (LBR) and instance-based learner (IB1), are compared using a set of user profile data. According to our simulation results, the NB and IB1 classifiers have the highest classification accuracy with the lowest error rate. The obtained simulation results have been evaluated against existing work on support vector machines (SVMs), decision trees (DTs) and neural networks (NNs).

    Weighted instance based learner (WIBL) for user profiling

    With an increase in web-based products and services, user profiling has created opportunities for both businesses and other organizations to provide a channel for user awareness as well as to achieve high user satisfaction. Apart from traditional collaborative and content-based methods, a number of classification and clustering algorithms have been used for user profiling. The Instance Based Learner (IBL) classifier is a comprehensive form of the Nearest Neighbour (NN) algorithm, and it is suitable for user profiling because users with similar profiles are likely to share similar personal interests and preferences. In IBL every attribute has an equal effect on the classification regardless of its relevance. In this paper, we propose a weighted classification method, namely the Weighted Instance Based Learner (WIBL), to build and handle user profiles. With WIBL, we introduce the Per Category Feature (PCF) method to IBL in order to distinguish the effect of attributes on classification. PCF is an attribute weighting method that assigns weights to attributes using conditional probabilities. The direct use of this method with IBL is not possible. Hence, two possible solutions are also proposed to address this problem. This study aims to test the performance of WIBL for user profiling. To validate the performance of WIBL, a series of computer simulations were carried out. These simulations were conducted using a large user profile database that includes 5000 training and 1000 test instances (users). Here, each user is represented with three sets of profile information: demographic, interest and preference data. The results illustrate that WIBL with PCF methods performs better than IBL on user profiling, reducing the error by up to 28% on the selected dataset.
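To make the phrase "assigns weights to attributes using conditional probabilities" more concrete, the following is a minimal sketch under stated assumptions. The abstract does not give the PCF formula, so the sketch simply estimates P(category | attribute value) from co-occurrence counts as one possible conditional-probability weight; the function name, the key layout and the toy data are hypothetical and should not be read as the authors' method.

```python
from collections import Counter, defaultdict


def conditional_weights(instances, labels):
    """Estimate P(label | attribute index, value) from training counts.

    Returns a dict keyed by (attribute index, value, label).
    """
    counts = defaultdict(Counter)  # (attribute index, value) -> label counts
    for features, label in zip(instances, labels):
        for index, value in enumerate(features):
            counts[(index, value)][label] += 1

    weights = {}
    for (index, value), label_counts in counts.items():
        total = sum(label_counts.values())
        for label, n in label_counts.items():
            weights[(index, value, label)] = n / total
    return weights


# Hypothetical toy data: (age group, phone colour) -> preferred content
instances = [
    ("young", "red"),
    ("young", "blue"),
    ("senior", "red"),
    ("senior", "blue"),
]
labels = ["sports", "sports", "news", "news"]

weights = conditional_weights(instances, labels)
age_weight = weights.get((0, "young", "sports"), 0.0)
colour_weight = weights.get((1, "red", "sports"), 0.0)
print(age_weight, colour_weight)
```

In this toy data the age group fully determines the category while the phone colour carries no information, so the estimated weight for the age value is higher than the one for the colour value.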

    A comparative study of selected classifiers with classification accuracy in user profiling


    Data as Digital Assets: The Case of Targeted Advertising

    Facilitated by the growth of cloud computing, artificial intelligence (e.g. machine learning), and big data (e.g. predictive analytics), new tracking and profiling techniques have been developed. They have enabled the rise of targeted advertising, that is, the provision of advertisements tailored to the tastes and habits of the user who actually views them. If targeted advertising is effective, data protection laws still apply. Most regulations look at the phenomenon from the data protection perspective, whilst in this paper it is argued that a holistic approach should be sought. Indeed, intellectual property, competition law, and consumer protection necessarily come into play. A general idea in this paper is that one should treat the data as digital assets in the users’ IP portfolio, thus leading the users to care more about the way their data are processed, shared, and sold. The starting point is the regulatory framework in Europe, with particular regard to the ePrivacy Directive. After critically analysing some international and European self-regulatory initiatives, case studies on Facebook and the use of data on sexual orientation will be presented to display how these systems work in practice if an Italian user files a claim with the Istituto di Autodisciplina Pubblicitaria that his rights have been violated. The chapter goes on to compare the Data Protection Directive and the General Data Protection Regulation, with a focus on direct marketing. Given that Google is the main actor of the targeted advertising world, the chapter explains how the platform works and analyses its privacy policy to assess how data are treated with regard to this form of advertising. Before concluding, the chapter looks at targeted advertising from an intellectual property and competition law perspective. The chosen prism is the Facebook / WhatsApp concentration. The paper aims inter alia to evaluate whether the decision of the Commission, which authorised the concentration, would be different today, in light of the change in WhatsApp’s privacy policy allowing the use by Facebook of certain data of WhatsApp’s users. The chapter assesses, more generally, whether targeted advertising can be prevented or somehow regulated through the unfair commercial practices regime. This chapter concludes with a pragmatic proposal which aims to empower the users, yet strike a balance between their interests and rights and those of the ad networks, publishers, and advertisers (advertising companies). In general, one should recognise that the opt-in regime required by some regulators is not implemented by the targeted advertising companies; cumbersome regimes such as the notice and consent provided for by the ePrivacy Directive have been a failure. Therefore, one should impose on businesses a more reasonable opt-out mechanism, provided that the right to dissent is actually enforced (as opposed to the current practice of circumventing adblockers and similar tools) and that the information is clear, brief, and provided in an interactive and gamified way. The user has to be at the centre of the system, but data protection rules may not be the best means therefor.

    Feature weighted clustering for user profiling
